{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tensorflow的常见损失函数"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "损失函数（loss function）：\n",
    "* 用来评价模型的预测值和真实值不一样的程度，损失函数越好，通常模型的性能越好。\n",
    "* 不同的模型用的损失函数一般是不一样的\n",
    "* 损失函数是编译模型时所需的两个参数之一：model.compile(loss='mean_squared_error', optimizer='sgd')\n",
    "\n",
    "\n",
    "损失函数为每个数据点返回一个数字，有以下两个参数:  \n",
    "* y_true: 真实标签\n",
    "* y_pred: 预测值  \n",
    "\n",
    "代码演示内容：\n",
    "* 二分类：BinaryCrossentropy\n",
    "* 多分类：SparseCategoricalCrossentropy\n",
    "* 多分类：CategoricalCrossentropy\n",
    "* 回归：MSE\n",
    "* 自己实现损失函数MSE"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. 二分类：BinaryCrossentropy\n",
    "\n",
    "* 用于二分类的场景，比如标签值为0和1\n",
    "* 计算真实标签和预测标签之间的交叉熵损失\n",
    "* model.compile(optimizer='sgd', loss=tf.keras.losses.BinaryCrossentropy())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 这是一个batch，有4个样本\n",
    "y_true = np.array([[0], [1], [0], [1]])\n",
    "y_pred = np.array([[0.8], [0.7], [0.6], [0.5]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 1)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_true.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 1)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y_pred.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8938874006271362"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 默认输出整个batch的loss的平均值\n",
    "bce = tf.keras.losses.BinaryCrossentropy()\n",
    "bce(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.60943747, 0.35667479, 0.91629046, 0.693147  ])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 默认输出整个batch的loss的平均值\n",
    "bce = tf.keras.losses.BinaryCrossentropy(reduction='none')\n",
    "bce(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. 多分类：SparseCategoricalCrossentropy\n",
    "\n",
    "* 用于多分类的场景\n",
    "* 当y_true是[0, 1, 2]这样的数字的时候用这个函数\n",
    "* 计算标签和预测之间的交叉熵损失\n",
    "* model.compile(optimizer='sgd', loss=tf.keras.losses.SparseCategoricalCrossentropy())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 三分类，则标签值范围是[0, 1, 2]\n",
    "# 对于batch为2的数据\n",
    "y_true = [1, 2]\n",
    "\n",
    "# y_pred其实是预测值在多分类的概率分布\n",
    "y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.1769392"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "loss_func = tf.keras.losses.SparseCategoricalCrossentropy()\n",
    "loss_func(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.05129344, 2.3025851 ], dtype=float32)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 默认情况下，一个batch只会输出一个平均后的数字，指定reduction='none'则每个样本输出数字\n",
    "loss_func = tf.keras.losses.SparseCategoricalCrossentropy(reduction='none')\n",
    "loss_func(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. 多分类：CategoricalCrossentropy\n",
    "\n",
    "* 用于多分类的场景\n",
    "* 当y_true是[[1,0,0], [0,1,0], [0,0,1]]这样的one-hot编码的时候\n",
    "* 即y_true的shape和y_pred的shape是相同的"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# batch_size = 2\n",
    "\n",
    "# y true现在是one-hot编码\n",
    "y_true = [[0, 1, 0], [0, 0, 1]]\n",
    "# y pred依然是概率分布\n",
    "y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.1769392"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "loss_func = tf.keras.losses.CategoricalCrossentropy()\n",
    "loss_func(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.05129331, 2.3025851 ], dtype=float32)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "loss_func = tf.keras.losses.CategoricalCrossentropy(reduction='none')\n",
    "loss_func(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. 回归：MSE\n",
    "\n",
    "* 回归问题中最常用的损失函数\n",
    "* 计算标签和预测之间的均方误差\n",
    "* 实现函数：loss = mean(square(y_true - y_pred), axis=-1)\n",
    "* 使用：model.compile(optimizer='sgd', loss=tf.keras.losses.MeanSquaredError())\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# batch_size = 2\n",
    "y_true = [[0., 1.], [0., 0.]]\n",
    "y_pred = [[1., 1.], [1., 0.]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 每个batch输出自己的loss\n",
    "loss = tf.keras.losses.MeanSquaredError()\n",
    "loss(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.5, 0.5], dtype=float32)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 每个batch输出自己的loss\n",
    "loss = tf.keras.losses.MeanSquaredError(reduction='none')\n",
    "loss(y_true, y_pred).numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. 自己实现损失函数MSE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"./images/mse.svg\" style=\"margin-left:0px\"/>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "# batch_size = 2\n",
    "y_true = np.array([[0., 1.], [0., 0.]])\n",
    "y_pred = np.array([[1., 1.], [1., 0.]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss(y_pred, y_true):\n",
    "    # 相减、平方、平均\n",
    "    return tf.reduce_mean(tf.square(y_pred - y_true))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.5"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "loss(y_pred, y_true).numpy()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
